Abstract: Estimating the intrinsic dimensionality of data is a classic problem in pattern recognition and statistics. Principal Component Analysis (PCA) is a powerful tool for discovering the dimensionality of data sets with a linear structure; it becomes ineffective, however, when the data have a nonlinear structure. In this paper, we propose a new PCA-based method to estimate the intrinsic dimension of data with nonlinear structures. Our method works by first finding a minimal cover of the data set, then performing PCA locally on each subset in the cover, and finally giving the estimation result by checking the data variance on all small neighborhood regions. The proposed method utilizes the whole data set to estimate its intrinsic dimension and is convenient for incremental learning. In addition, our new PCA procedure can filter out noise in the data and converges to a stable estimate as the neighborhood region size increases. Experiments on synthetic and real world data sets show the effectiveness of the proposed method.

Index Terms — Pattern recognition; Principal component 
analysis; Intrinsic dimensionality estimation. 



I. Introduction 

The intrinsic dimensionality (ID) of data is key a priori knowledge in pattern recognition and statistics, for example in time series analysis, classification and neural networks, where it can be used to improve performance. In time series analysis [1], the domain of attraction of a nonlinear dynamic system has a very complex geometric structure, and the study of the geometry of the attraction domain is closely related to fractal geometry. The fractal dimension is an important tool for characterizing certain geometric properties of complex sets. In neural network design [2], the number of hidden units in the encoding middle layer should be chosen according to the ID of the data. In classification tasks [3], in order to balance the generalization ability and the empirical risk, the complexity of the function should also be related to the ID of the data.

Recently, manifold learning, an important approach to nonlinear dimensionality reduction, has drawn great interest. Important manifold learning algorithms include isometric feature mapping (Isomap) [4], locally linear embedding (LLE) [5] and Laplacian eigenmaps (LE) [6]. They all assume that the data are distributed on an intrinsically low-dimensional sub-manifold [7] and reduce the dimensionality of the data by investigating their intrinsic structure. However, all manifold learning algorithms require the ID of the data as a key parameter for implementation.

Previous ID estimation methods can be categorized mainly into three groups: the projection approach, the geometric approach and the probabilistic approach. The projection approach [8]-[11] finds the ID by checking the low-dimensional embedding of the data. The geometric approach [22] finds the ID by investigating the intrinsic geometric structure of the data. The probabilistic approach [19] builds estimators by making distributional assumptions on the data. These approaches are briefly introduced in Section II.

In this paper, we propose a new PCA-based method 
for ID estimation which is called the C-PCA method. 
The proposed method first finds a minimal cover of the 
data set, and each subset in the cover is considered as 
a small subregion of the data manifold. Then, on each 
subset, a revised PCA procedure is applied to examine 
the local structure. The revised PCA method can filter 
out noise in data and leads to a stable and conver- 
gent ID estimation with the increase of the subregion 
size, as shown by the experimental results. This is an 
advantage over the traditional PCA method which is 
very sensitive to noise, outliers and the choice of the 
subregion size. Further analysis shows that the revised 
PCA procedure can efficiently reduce the running time 
complexity and utilize all data samples for ID estimation. 
We should remark that our ID estimation method is also 
applicable to incremental learning for consecutive data. 
Our method is compared with the maximum likelihood
estimation (MLE) method [19], the manifold adaptive
method (which is referred to as the k-k/2 NN method
in this paper) [18] and the k-nearest neighbor graph
(k-NNG) method [26], [27] through experiments.

The rest of the paper is organized as follows. In Section II, previous ID estimation methods are briefly reviewed. In Section III, the new ID estimation method (C-PCA) is introduced. In Section IV, experiments are conducted on synthetic and real world data sets to show the effectiveness of the proposed algorithm. Conclusions are drawn in Section V.



II. Previous algorithms on ID estimation 

There are three main approaches to estimating the ID of data: the projection, geometric and probabilistic approaches.

The projection approach first projects the data into a low-dimensional space and then determines the ID by examining the low-dimensional representation of the data. PCA is a classical projection method which finds the ID by counting the number of significant eigenvalues. However, the traditional PCA only works on data lying in a linear subspace and becomes ineffective when the data are distributed on a nonlinear manifold. To overcome this limitation, local-PCA [9] and OTPMs PCA [10] have been proposed; they can discover the ID of data lying on nonlinear manifolds by performing PCA locally. The Isomap algorithm yields the ID of data by inspecting the elbow of the residual variance curve [4]. Cheng et al. gave an efficient procedure to compute eigenvalues and eigenvectors in PCA [11].

Geometric approaches make use of the geometric structure of the data to build ID estimators. Fractal-based methods have been well developed and used in time series analysis. For example, the correlation dimension (a kind of fractal dimension) was used in [13] to estimate the ID, whilst the method of packing numbers was proposed in [14] to find the ID. Other fractal-based methods include the kernel correlation method [23] and the quantization estimator [21]. A good survey of fractal-based methods can be found in [22]. There are also many methods based on techniques from computational geometry. Lin [15] and Cheng [25] suggested constructing simplices to find the ID, while the nearest neighbor approach uses the distances between data points and their nearest neighbors to build ID estimators, such as the estimator proposed by Pettis et al. [29], the k-NNG method [26], [27] and the incising ball method [17]. A comparison of the local-PCA method with that introduced by Pettis et al. was made in [10].

Probabilistic methods are based on probabilistic assumptions on the data and have been tested on various data sets with stable performance. The MLE method [19] is representative of this approach; its final global estimator is given by averaging the local estimators

$$\hat{d}_k(x_i) = \left[ \frac{1}{k-1} \sum_{j=1}^{k-1} \log \frac{T_k(x_i)}{T_j(x_i)} \right]^{-1},$$

where $T_k(x_i)$ is the distance between $x_i$ and its $k$-th nearest neighbor. MacKay and Ghahramani [20] pointed out that, compared with averaging the local estimators directly, it is more sensible to average their inverses $\hat{d}_k(x_i)^{-1}$, $i = 1, \dots, N$, for the maximum likelihood purpose. The recommended final estimator is

$$\hat{d}_k = \left[ \frac{1}{N} \sum_{i=1}^{N} \hat{d}_k(x_i)^{-1} \right]^{-1},$$

where $\hat{d}_k$ is the estimated ID corresponding to the neighborhood size $k$.
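
The nearest-neighbor MLE described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the brute-force neighbor search, the $1/(k-1)$ normalization and the two synthetic test sets are all choices made here, not taken from [19] or [20], and the data points are assumed to be distinct so that no neighbor distance is zero.

```python
import math
import random

def knn_distances(points, i, k):
    """Sorted distances from points[i] to its k nearest neighbours (itself excluded)."""
    dists = sorted(math.dist(points[i], p) for j, p in enumerate(points) if j != i)
    return dists[:k]

def inverse_local_mle(points, i, k):
    """Inverse local estimate at points[i]: mean of log(T_k / T_j) for j = 1..k-1."""
    t = knn_distances(points, i, k)
    return sum(math.log(t[-1] / t_j) for t_j in t[:-1]) / (k - 1)

def mle_dimension(points, k):
    """Global estimate: inverse of the average of the inverse local estimates."""
    inverses = [inverse_local_mle(points, i, k) for i in range(len(points))]
    return len(inverses) / sum(inverses)

random.seed(0)
# Hypothetical test data: a line segment embedded in 3-D and a filled unit square.
line = [(t, 2.0 * t, -t) for t in (random.random() for _ in range(300))]
square = [(random.random(), random.random()) for _ in range(300)]
d_line = mle_dimension(line, k=10)
d_square = mle_dimension(square, k=10)
```

With these synthetic sets, `d_line` should come out close to 1 and `d_square` close to 2, which matches the dimension of the underlying shapes rather than that of the ambient space.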

III. ID estimation using PCA with cover sets: C-PCA

Basically, there are two kinds of definitions of ID 
that are commonly used. One is based on the fractal 
dimension, such as the Hausdorff dimension and the 
packing dimension that are usually real positive numbers. 
The other kind of definition is based on the embedding 
manifold whose ID is always an integer. 

Definition 3.1 (Embedding manifold and dimension): Let $d < D$ and let $\Omega$ be a compact open set in $\mathbb{R}^d$. Assume that $\Omega$ spans $\mathbb{R}^d$ and that $f : \Omega \to \mathbb{R}^D$ is a smooth function. The set $X = f(\Omega)$ is called an embedding manifold with $d$ its embedding dimension.

More and more real world data have been shown to have nonlinear intrinsic structures and may be distributed on nonlinear embedding manifolds [7]. Therefore, estimating the embedding ID of data has become an important problem [17]. In this paper, we focus on the estimation of embedding dimensions.

A. PCA-based methods for ID estimation 

The traditional PCA finds a subspace on which the data projections have maximum variance. Consider a data set $X = \{x_1, \dots, x_N\}$ with $x_i \in \mathbb{R}^D$. Let $X = [x_1, \dots, x_N]$ and $\bar{x} = \frac{1}{N}\sum_{i=1}^{N} x_i$. The covariance matrix of $X$ is given by

$$C = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(x_i - \bar{x})^T.$$

Since $C$ is a positive semi-definite matrix, we can assume that $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N \ge 0$ are the eigenvalues of $C$ with $v_1, \dots, v_N$ the corresponding orthonormal eigenvectors, respectively. The eigen-decomposition of the matrix $C$ is denoted as $C = \Gamma D \Gamma^T$, where $D$ is a diagonal matrix with $D_{ii} = \lambda_i$ and $\Gamma = [v_1, \dots, v_N]$. The eigenvector $v_i$ is the $i$-th principal direction (PD) and, for any variable $x$, $y_i = v_i^T x$ is defined as the $i$-th principal component (PC). By definition, we have the variance $\operatorname{var}(y_i) = \lambda_i$ and the covariance $\operatorname{cov}(y_i, y_j) = 0$ for $i \ne j$.

If the data set $X$ is distributed on a linear subspace of dimension $d$, then the first $d$ PDs should span the subspace and the corresponding PCs should account for most of the variation contained in $X$. On the other hand, the variance of the PCs on $v_{d+1}, \dots, v_N$ (i.e., the PDs which are orthogonal to the linear subspace of dimension $d$) will be trivial. The most commonly used criteria for ID estimation with the PCA method are

$$\frac{\min_{i=1,\dots,d} \operatorname{var}(y_i)}{\max_{j=d+1,\dots,N} \operatorname{var}(y_j)} > \alpha > 1 \qquad (1)$$

and the percentage of the accounted variance

$$\frac{\sum_{i=1}^{d} \operatorname{var}(y_i)}{\sum_{i=1}^{N} \operatorname{var}(y_i)} > \beta, \qquad 0 < \beta < 1. \qquad (2)$$

In this paper, the ID, $d$, is determined if condition (1) or (2) is satisfied.
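
As an illustration of how the two criteria can be applied, the following Python sketch scans candidate dimensions and returns the smallest $d$ for which the variance-ratio criterion or the accounted-variance criterion holds. The function name `estimate_id`, the choice of the smallest satisfying $d$, and the fallback to the full dimension are assumptions made for this example; the variances are assumed to be given in decreasing order with a positive sum.

```python
def estimate_id(variances, alpha=10.0, beta=0.8):
    """Smallest d meeting the ratio criterion or the accounted-variance criterion.

    variances: PC variances (eigenvalues) sorted in decreasing order.
    Returns len(variances) if neither criterion is met for any smaller d.
    """
    total = sum(variances)
    n = len(variances)
    for d in range(1, n):
        head, tail = variances[:d], variances[d:]
        # Ratio of the smallest retained variance to the largest discarded one.
        gap_ok = max(tail) == 0 or min(head) / max(tail) > alpha
        # Share of the total variance accounted for by the first d PCs.
        share_ok = sum(head) / total > beta
        if gap_ok or share_ok:
            return d
    return n

d_clear = estimate_id([5.0, 4.0, 0.1, 0.05])   # sharp drop after two PCs
d_slow = estimate_id([3.0, 2.0, 1.5, 1.0])     # no sharp drop, variance share decides
d_flat = estimate_id([1.0, 1.0, 1.0, 1.0])     # isotropic spectrum
```

The defaults `alpha=10.0` and `beta=0.8` mirror the parameter values reported in the experiments section.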

B. Filtering out the noise of data 

There are two challenges for PCA-based ID estimation methods. The first is how to filter out the noise in the data, and the second is how to choose the size of the subregions on the manifold. Previously, the ID estimate obtained with PCA-based methods always increased with the size of the subregions, so the methods could not converge to a stable ID estimate. To address these two limitations, we propose the following noise filtering procedure, which can efficiently filter out the noise in the data and makes PCA-based methods converge.

Consider the effect of additive white noise $\mu$ in the data with $E(\mu) = 0$ and $\operatorname{var}(\mu) = \sigma^2$. The covariance matrix of the noise-corrupted data is given by

$$\tilde{C} = \operatorname{var}(X + \mu) = C + \sigma^2 I,$$

where $C$ is the covariance matrix of the data $X$. It can be seen that the PDs of $\tilde{C}$ are identical to those of $C$ and the eigenvalues of $\tilde{C}$ are $\tilde{\lambda}_i = \lambda_i + \sigma^2$. If $\sigma$ is relatively large, then the ID criteria (1) and (2) will be ineffective.

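The variance of the data projections on the PDs that are orthogonal to the intrinsic embedding subspace is very small, and most of this variance is produced by noise. Therefore, it is possible to calculate the variance of the noise by projecting the data on the orthogonal PDs. Given a real number $P$ which is very close to 1 ($P$ is taken to be 0.95 in this paper), the noise part of the data, i.e., the PCs $y_r, \dots, y_N$, is determined by

$$\frac{\sum_{i=1}^{r-2} \operatorname{var}(y_i)}{\sum_{i=1}^{N} \operatorname{var}(y_i)} < P \quad \text{and} \quad \frac{\sum_{i=1}^{r-1} \operatorname{var}(y_i)}{\sum_{i=1}^{N} \operatorname{var}(y_i)} \ge P.$$

Thus, the variance of the noise contained in the data can be estimated as

$$\sigma^2 = \frac{1}{N-r+1} \sum_{i=r}^{N} \operatorname{var}(y_i).$$

Our new ID estimation criteria make use of the updated variance on the PDs: $\operatorname{var}(y_i) = \tilde{\lambda}_i - \sigma^2$, where $\tilde{\lambda}_i$ is the $i$-th eigenvalue of the covariance matrix of the noisy data.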
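
The noise filtering step can be sketched as follows. The sketch rests on assumptions that should be read as such: the noise part is taken to be the trailing PCs that remain once the leading PCs account for a share $P$ of the total variance, the noise variance is taken to be the mean variance of those trailing PCs, and negative corrected variances are clipped to zero. The function name `filter_noise` is hypothetical.

```python
def filter_noise(eigenvalues, p=0.95):
    """Return (corrected variances, estimated noise variance).

    eigenvalues: covariance eigenvalues of the noisy data, in decreasing order.
    """
    total = sum(eigenvalues)
    running = 0.0
    r = len(eigenvalues) + 1          # 1-based index of the first noise PC
    for i, lam in enumerate(eigenvalues, start=1):
        running += lam
        if running / total >= p:      # leading PCs 1..i reach the share p
            r = i + 1
            break
    noise_part = eigenvalues[r - 1:]  # PCs y_r, ..., y_N
    sigma2 = sum(noise_part) / len(noise_part) if noise_part else 0.0
    # Subtract the noise floor from every eigenvalue, clipping at zero.
    corrected = [max(lam - sigma2, 0.0) for lam in eigenvalues]
    return corrected, sigma2

# Hypothetical spectrum: signal variances 10 and 8 plus white noise of variance 0.2.
noisy = [10.2, 8.2, 0.2, 0.2, 0.2, 0.2]
clean, sigma2 = filter_noise(noisy)
```

On this spectrum the first two PCs already carry more than 95% of the variance, so the remaining four are treated as noise and the corrected spectrum has two clearly dominant entries.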

Remark 3.1: Noise is typically different from outliers. Noise affects every data point independently, while outliers are data points that lie at least a certain distance away from the data points on the manifold. The proposed procedure is very robust to both noise and outliers, as shown in the experiments. On the other hand, the traditional PCA procedure can handle limited noise but is very sensitive to outliers.

C. The local region selection method 

An embedding manifold can be approximated locally 
by linear subspaces. The dimensionality of each linear 
subspace should be equal to the ID of the embedding 
manifold. Therefore, it is possible to estimate the ID of 
a nonlinear manifold by checking it locally. A cover is
a set whose elements are subsets of the data set such
that the union of all subsets in the cover contains
the whole data set.

Definition 3.2 (The set cover problem): Given a universe $X$ of $N$ elements and a collection $F = \{F_1, \dots, F_N\}$ of subsets of $X$, the set cover problem is concerned with finding a minimum sub-collection of $F$ that covers all data points.

Using a minimum cover has two advantages. First, it yields the minimal number of subregions, which saves computational time. Secondly, an ID estimate that utilizes the whole data set is more reliable. However, finding such a minimum cover is an NP-hard problem. In the following, we introduce an algorithm which can approximately find a minimum cover of a data set.

Given a parameter, either an integer $k$ or a real number $\epsilon$, there are two ways to define the neighborhood of any data point $x$:

1) The $k$-NN method: any data point $x_i$ that is one of the $k$ nearest data points of $x$ is in the neighborhood of $x$;
2) The $\epsilon$-NN method: any data point $x_i$ in the region $\{y : \|y - x\| < \epsilon\}$ is in the neighborhood of $x$.

Without loss of generality, we may assume that the index of the data points is independent of their locations.
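
The two neighborhood definitions can be written down directly. The sketch below uses brute-force distance computation and hypothetical function names; whether a point counts as its own neighbor is not stated in the text, so here it is excluded by assumption.

```python
import math

def knn_neighborhood(points, i, k):
    """Indices of the k points nearest to points[i] (points[i] itself excluded)."""
    order = sorted(range(len(points)), key=lambda j: math.dist(points[i], points[j]))
    return [j for j in order if j != i][:k]

def eps_neighborhood(points, i, eps):
    """Indices of all other points strictly within distance eps of points[i]."""
    return [j for j in range(len(points))
            if j != i and math.dist(points[i], points[j]) < eps]

# Hypothetical toy data: three nearby points and one far away.
pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
nn = knn_neighborhood(pts, 0, 2)
ball = eps_neighborhood(pts, 0, 2.5)
small_ball = eps_neighborhood(pts, 0, 1.5)
```

The $k$-NN rule always returns exactly $k$ neighbors, whereas the size of an $\epsilon$-neighborhood depends on the local sampling density.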

Algorithm 1 (Minimum set cover algorithm)
Input: Neighborhood size $k$ (integer) or $\epsilon$ (real number), distance matrix $D = (\|x_i - x_j\|)$
Output: Minimum cover $F = \{(F_i, r_i), i = 1, \dots, S\}$.
1: for $i = 1$ to $N$ do
2:   Identify the neighbors $\{x_{i_1}, \dots, x_{i_{p_i}}\}$ of $x_i$ by the $k$-NN or $\epsilon$-NN method. Let $F_i = \{i_1, \dots, i_{p_i}\}$ be the index set of the neighborhood and let $D$ be the 0-1 incidence matrix (with $D_{ij} = 1$ if $j \in F_i$ and $0$ otherwise).
3: end for
4: Let $F = \{(F_i, r_i = 0), i = 1, \dots, N\}$
5: for $i = 1$ to $N$ do
6:   Let the frequency of $x_i$ be computed by $Q_i = \sum_{j=1}^{N} D_{ji}$
7: end for
8: for $i = 1$ to $N$ do
9:   if $Q_{i_1}, Q_{i_2}, \dots, Q_{i_{p_i}} > 1$ then
10:    Remove $(F_i, r_i)$ from the cover set $F$ and set $Q_{i_1} = Q_{i_1} - 1, \dots, Q_{i_{p_i}} = Q_{i_{p_i}} - 1$.
11:  else
12:    Let $r_i = \max_{j} \|x_i - x_{i_j}\|$
13:  end if
14: end for
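
The pruning idea of Algorithm 1 can be sketched in Python as follows. The sketch assumes that each neighborhood $F_i$ contains the index $i$ itself (so that the initial collection covers every point) and that a neighborhood is dropped only when every one of its members is still contained in some other remaining neighborhood. The function name and the toy data are hypothetical.

```python
def approximate_cover(dist, neighborhoods):
    """Greedy pruning of redundant neighbourhoods.

    dist: N x N distance matrix as nested lists.
    neighborhoods[i]: indices in F_i, assumed to include i itself.
    Returns {i: (F_i, r_i)} for the subsets kept in the cover.
    """
    n = len(dist)
    freq = [0] * n                       # Q_j: number of subsets containing x_j
    for members in neighborhoods:
        for j in members:
            freq[j] += 1
    cover = {}
    for i, members in enumerate(neighborhoods):
        if all(freq[j] > 1 for j in members):
            for j in members:            # F_i is redundant: drop it
                freq[j] -= 1
        else:                            # F_i is needed: keep it and record its radius
            cover[i] = (members, max(dist[i][j] for j in members))
    return cover

# Hypothetical toy data: ten equally spaced points on a line.
points = [float(i) for i in range(10)]
dist = [[abs(a - b) for b in points] for a in points]
# F_i = x_i itself plus its two nearest neighbours (ties broken by index).
neighborhoods = [sorted(range(10), key=lambda j: (dist[i][j], j))[:3] for i in range(10)]
cover = approximate_cover(dist, neighborhoods)
covered = set().union(*(members for members, _ in cover.values()))
```

Because a subset is removed only while all of its members have frequency greater than one, every point keeps a frequency of at least one, so the remaining subsets still cover the whole data set.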



Using the above approximation algorithm, a cover
$F = \{(F_i, r_i), i = 1, \dots, S\}$ of the data set $X$ can
be found. Compared with the local region selection
algorithm used in [9], our algorithm above has a low
time complexity and avoids the supervised process to
choose the neighborhood. Intuitively, the cardinality S 
of the cover F satisfies that N/k < S < N/2k, where 
k is the average number of neighbors. 

D. The proposed ID estimation algorithm 

We now present the proposed ID estimation algorithm
using local PCA on the minimal set cover: the C-PCA
algorithms, which are summarized below for both batch
and incremental data, respectively.

Algorithm 2 (The C-PCA algorithm for batch data)
Step 1. Given a parameter $k$ or $\epsilon$, compute a minimal cover of $X$ by Algorithm 1. Without loss of generality, $F = \{(F_i, r_i) : i = 1, \dots, S\}$ is assumed to be the constructed minimal set cover.
Step 2. Perform the PCA algorithm proposed in Subsections III-A and III-B on the subsets $F_i$, $i = 1, \dots, S$. The local ID estimates $\{d_i\}_{i=1}^{S}$ are then obtained.
Step 3. Let $\lambda_{ij}$ be the $j$-th eigenvalue on the $i$-th subset in decreasing order. $\lambda_j = \sum_{i=1}^{S} \lambda_{ij}$ is considered as the variance of $X$ on its $j$-th PD. Subsequently, the global ID estimate $d$ can be derived using the criteria (1) or (2).

In many cases, consecutive data are collected incrementally. This requires an incremental learning algorithm to track changes in the data structure in time. The incremental C-PCA algorithm is presented as follows.

Algorithm 3 (The incremental C-PCA algorithm)
Step 1. The new data point is assumed to be $x$. Let $\{x_1, \dots, x_S\}$ be the centers of the subsets in the cover. Find the nearest center $x_q$ of $x$:
$$x_q = \arg\min_{i=1,\dots,S} \|x - x_i\|.$$
Step 2. If $\|x_q - x\| > r_q$, then the data point $x$ is considered as an outlier and the remaining part of the algorithm will not be performed on $x$. Otherwise, go to Step 3.
Step 3. Perform PCA on $F_q = F_q \cup \{x\}$. Let $\lambda'_{qj}$ be the $j$-th eigenvalue. Update $\lambda_j$ by $\lambda_j = \lambda_j + \lambda'_{qj} - \lambda_{qj}$. Then let $\lambda_{qj} = \lambda'_{qj}$.
Step 4. Update the local ID, $d_q$, and the global ID, $d$, of $X$.

Remark 3.2: Our method is different from the local-PCA [9] in many aspects. First, the centers and the local regions are determined simultaneously by using one parameter, the neighborhood size, whilst in [9] the centers and neighborhood sizes are determined by two parameters. Secondly, our approach finds the subregions by approximating a minimum cover of the data set, while the local-PCA in [9] does not guarantee that the selected subregions cover the whole data set.
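
The bookkeeping behind the batch aggregation and the incremental update can be sketched as below. Several points are assumptions of this example rather than statements of the method: the global variance on each PD is taken as the plain sum of the local eigenvalues over the subsets, the local PCA on the enlarged subset is supplied by the caller through a hypothetical callback `local_pca(q, x)`, and the final re-evaluation of the local and global IDs is left out.

```python
import math

def global_variances(local_eigs):
    """lambda_j as the sum over subsets i of lambda_ij (rows: subsets, columns: PDs)."""
    return [sum(column) for column in zip(*local_eigs)]

def incremental_step(x, centers, radii, local_eigs, global_eigs, local_pca):
    """Process one new point x; return the updated subset index, or None for an outlier."""
    # Step 1: nearest subset center.
    q = min(range(len(centers)), key=lambda i: math.dist(x, centers[i]))
    # Step 2: a point outside the radius of its nearest subset is treated as an outlier.
    if math.dist(x, centers[q]) > radii[q]:
        return None
    # Step 3: redo the local PCA and patch the global variances in place.
    new_eigs = local_pca(q, x)
    for j, lam in enumerate(new_eigs):
        global_eigs[j] += lam - local_eigs[q][j]
    local_eigs[q] = list(new_eigs)
    return q

# Hypothetical state: two subsets with two local eigenvalues each.
centers = [(0.0, 0.0), (10.0, 0.0)]
radii = [2.0, 2.0]
local_eigs = [[4.0, 1.0], [3.0, 0.5]]
global_eigs = global_variances(local_eigs)
fake_pca = lambda q, x: [5.0, 1.5]     # stand-in for a real local PCA
hit = incremental_step((1.0, 0.0), centers, radii, local_eigs, global_eigs, fake_pca)
miss = incremental_step((5.0, 0.0), centers, radii, local_eigs, global_eigs, fake_pca)
```

Patching only the terms that belong to subset $q$ keeps the global variances equal to what a full recomputation over all subsets would give, while touching a single local PCA per new point.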

E. Computational complexity analysis 

The computational complexity of our algorithms is one of the most important issues for their application. The batch mode ID estimation can be divided into two parts. In the first part, computing the distance matrix needs $O(N^2)$ time, searching the nearest neighbors for every data point needs $O(kN^2)$ time and finding an approximate minimum cover of $X$ needs $O(kN)$ time. Therefore, the first part needs $O((k+1)N^2 + kN)$ running time. In the second part, performing PCA locally needs $k^3 \times (N/k) \approx O(k^2 N)$ running time. To sum up, the total running time needed for the batch mode algorithm is $O((k+1)N^2 + (k^2+k)N)$. If the proposed method is embedded in a manifold learning algorithm, then the running time complexity can be reduced to $O((k^2+k)N)$ in the case when the distance matrix and the neighborhoods are already defined. This is a relatively small increase in the time complexity of a manifold learning algorithm, which is often as high as $O(N^3)$.

For incremental learning, the neighborhood identification step needs $O(N/k)$ running time, whilst the local PCA consumes $O((k+1)^3)$ running time. Therefore, the total time complexity for incremental learning is $O((k+1)^3 + N/k)$.

IV. Experiments 

The proposed algorithm was implemented with parameters $\alpha = 10$ and $\beta = 0.8$ for all the experiments.

In practice, it is found that the noise contained in data is of low dimension, except for additive white noise, which is assumed to be present in every component of the data vectors in $\mathbb{R}^D$. Thus, in practice, we only use the variances of the first $\min(10, N - r + 1)$ PCs in the noise part of the data to estimate the variance of the noise (see Subsection III-B).

Comparison is made among the k-k/2 NN method [18], the k-NNG method [26], the revised MLE (MLE in short) method [20], the C-PCA method and the L-PCA method, where the L-PCA method stands for the C-PCA method without the noise filtering procedure proposed in Subsection III-B. It should be noted that the results obtained by the MLE, k-k/2 NN and k-NNG methods are positive real numbers, while the L-PCA and C-PCA methods produce only integer ID estimates. In order to compare these results, we average the local ID estimates obtained with the C-PCA and L-PCA methods to provide a real-valued ID estimate: $d = \frac{1}{S}\sum_{i=1}^{S} d_i$.



A. 10-Mobius data 

The first data set is a 10-Mobius ring embedded in $\mathbb{R}^3$. Fig. 1(a) shows the scatter plot of the Mobius ring data set. As can be seen, the Mobius data points lie on a highly nonlinear manifold, with 1200 points uniformly distributed on the surface. Fig. 1(b) shows the results obtained by the five ID estimation algorithms against neighborhood sizes ranging from 4 to 40. The MLE method is the most stable and accurate algorithm for all neighborhood sizes. All algorithms converge to the correct estimate. The L-PCA method does not appear to diverge on this data set, possibly because the original dimensionality of the data is low.




Fig. 1. (a) shows the scatter plot of the Mobius ring data set, 
and (b) shows the ID estimation results corresponding to the size of 
subregions. 



B. Real world data sets 

Our algorithm is compared with the MLE, k-k/2 NN
and k-NNG methods on some benchmark real world data
sets: the Isoface data set [4], the LLEface data set [5]
and the MNIST '0' and '1' data sets [28].

The Isoface data set comprises 698 images of a head at a resolution of 64 x 64. Some samples of the Isoface data set are shown in Fig. 2(a). In the experiments, each image is reshaped to a 4096-dimensional vector. It can be seen that the Isoface data set has three degrees of freedom: up-down pose, left-right pose and lighting changes. In [4], the Isomap algorithm estimated its ID as 3 using the projection approach. As can be seen from Fig. 2(b), for neighborhood sizes from 4 to 40, the C-PCA estimator ranges from 2.3 to 3.5 and the MLE estimator ranges from 3.5 to 4.5. The estimates given by the k-NNG and k-k/2 NN methods oscillate badly with the neighborhood size, so both are unstable. Since the L-PCA method cannot filter out the noise contained in the data, it tends to overestimate the ID as the neighborhood size increases. This means that our noise filtering process plays a key role in the convergence of the C-PCA method.

The second data set is the LLEface data set, which contains 1965 samples in a 560-dimensional space (see Fig. 3(a) for some samples). From Fig. 3(b), it is seen that both the C-PCA and the MLE methods give a convergent ID estimate as the neighborhood size increases, while the L-PCA, k-k/2 NN and k-NNG methods do not seem to converge as the neighborhood size increases. The ID estimate given by the C-PCA method varies between 2.8 and 4.7 with the convergent estimate being 4.7, while the estimate obtained by the MLE method changes gradually from 5.2 to 5.8 with a convergent estimate of 5.8.

Fig. 2. (a) shows some samples of the Isoface data set. As can be seen, a head is under left-right, up-down and lighting changes. (b) presents the estimated ID of the Isoface data set.

Fig. 3. (a) shows some samples of the LLEface data set and (b) plots the estimated ID of the LLEface data set against the neighborhood size.

Fig. 4. (a) shows some samples of '0' in the MNIST data set and (b) gives the plot of the estimated ID of data '0' versus the neighborhood size.

We now consider two MNIST data sets: the set '0' and the set '1' (see Fig. 4(a) and Fig. 5(a) for some samples of these two data sets, respectively). The data set '0' contains 980 data points, while the data set '1' contains 1135 data points. It can be seen from Fig. 4(b) and Fig. 5(b) that all methods, except the L-PCA and k-NNG methods, converge as the neighborhood size increases. For the data set '0', it can be seen from Fig. 4(b) that the ID estimate given by the C-PCA method converges to 5.8 and the estimates given by both the MLE estimator and the k-k/2 NN estimator converge to 10. For the data set '1', Fig. 5(b) shows that the ID estimate obtained by the C-PCA method converges to 5.5 and the estimates provided by both the MLE method and the k-k/2 NN method converge to 7.2. Note that the result given by our method disagrees considerably with the results given by the other methods for the data sets '0' and '1'. A digit '0' is usually represented as an ellipse, which can be determined by the coordinates of its focus and its major and minor axes, so the ID of the data set '0' is likely to be 5. The digit '1' can be considered as a line segment which rotates from left to right, so a sensible ID estimate for the data set '1' may be between 4 and 5.

Fig. 5. (a) shows some samples of '1' in the MNIST data set and (b) presents the plot of the estimated ID of data '1' versus the neighborhood size.



C. Noisy data sets 

The traditional PCA algorithm is very sensitive to outliers, and the performance of PCA-based algorithms deteriorates rapidly if the data points are sparse on a manifold, as in the hand rotation data set (footnote 1). As can be seen from Fig. 6(a), the hand undergoes a one-dimensional movement, so the data points can be considered as lying on a one-dimensional curve. The data set contains 481 image samples, and each sample is a vector in a $512 \times 480$-dimensional space. Many outliers can be seen in its low-dimensional embedding by the Isomap algorithm (see Fig. 6(b)). Its ID estimation results with different methods are shown in Fig. 6(c).

Both the k-k/2 NN and k-NNG methods are sensitive to the choice of the neighborhood size and tend to overestimate the ID as the neighborhood size increases. On the other hand, the MLE estimator is more stable (see Fig. 6(c)). However, the minimum estimate of the MLE method is 1.75, which is still higher than the ID of this data set. The L-PCA method has the worst performance due to the outliers contained in the data set. The estimate of the C-PCA method, which varies between 1.5 and 1.2, is the closest one to the correct ID of this data set.

Footnote 1: CMU database: http://vasc.vi.cmu.edu/idb/html/motion/hand/index

Fig. 6. (a) shows selected samples of the hand rotation data set, (b) shows the low-dimensional embedding of the hand rotation data set by the Isomap algorithm, and (c) shows the ID estimates of the hand rotation data set.

We now transform the original 10-Mobius data into a 4-dimensional space using a Euclidean transformation. Random noise with mean 0 and variance 0.2 is added to the transformed data. The ID estimation results with different algorithms are given in Fig. 7. As can be seen from Fig. 7, the ID estimate given by the C-PCA method is the closest one to the correct ID of this noisy 10-Mobius data set. The other algorithms tend to overestimate the ID of the noisy data set. The estimate obtained by the L-PCA method is a little higher than that given by the C-PCA method due to the effect of noise.

Fig. 7. ID estimations of the noised Mobius data set.



V. Conclusion 

In this paper, we proposed a new ID estimation 
method based on PCA. The proposed algorithm is simple 
to implement and gives a convergent ID estimation corre- 
sponding to a wide range of neighborhood sizes. It is also 
convenient for incremental learning. Experiments have 
shown that the new algorithm has a robust performance. 